Rigid geometry solves “curse of dimensionality” effects in clustering methods: An application to omics data

نویسنده

  • Shun Adachi
چکیده

The quality of samples preserved long term at ultralow temperatures has not been adequately studied. To improve our understanding, we need a strategy to analyze protein degradation and metabolism at subfreezing temperatures. To do this, we obtained liquid chromatography-mass spectrometry (LC/MS) data of calculated protein signal intensities in HEK-293 cells. Our first attempt at directly clustering the values failed, most likely due to the so-called "curse of dimensionality". The clusters were not reproducible, and the outputs differed with different methods. By utilizing rigid geometry with a prime ideal I-adic (p-adic) metric, however, we rearranged the sample clusters into a meaningful and reproducible order, and the results were the same with each of the different clustering methods tested. Furthermore, we have also succeeded in application of this method to expression array data in similar situations. Thus, we eliminated the "curse of dimensionality" from the data set, at least in clustering methods. It is possible that our approach determines a characteristic value of systems that follow a Boltzmann distribution.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Coupled Rigid-viscoplastic Numerical Modeling for Evaluating Effects of Shoulder Geometry on Friction Stir-welded Aluminum Alloys

Shoulder geometry of tool plays an important role in friction-stir welding because it controls thermal interactions and heat generation. This work is proposed and developed a coupled rigid-viscoplastic numerical modeling based on computational fluid dynamics and finite element calculations aiming to understand these interactions. Model solves mass conservation, momentum, and energy equations in...

متن کامل

An Information Geometric Framework for Dimensionality Reduction

This report concerns the problem of dimensionality reduction through information geometric methods on statistical manifolds. While there has been considerable work recently presented regarding dimensionality reduction for the purposes of learning tasks such as classification, clustering, and visualization, these methods have focused primarily on Riemannian manifolds in Euclidean space. While su...

متن کامل

A H-K Clustering Algorithm For High Dimensional Data Using Ensemble Learning

Advances made to the traditional clustering algorithms solves the various problems such as curse of dimensionality and sparsity of data for multiple attributes. The traditional H-K clustering algorithm can solve the randomness and apriority of the initial centers of K-means clustering algorithm. But when we apply it to high dimensional data it causes the dimensional disaster problem due to high...

متن کامل

A Multi Linear Discriminant Analysis Method Using a Subtraction Criteria

Linear dimension reduction has been used in different application such as image processing and pattern recognition. All these data folds the original data to vectors and project them to an small dimensions. But in some applications such we may face with data that are not vectors such as image data. Folding the multidimensional data to vectors causes curse of dimensionality and mixed the differe...

متن کامل

Using k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering

Clustering of data is a difficult problem that is related to various fields and applications. Challenge is greater, as input space dimensions become larger and feature scales are different from each other. Hierarchical clustering methods are more flexible than their partitioning counterparts, as they do not need the number of clusters as input. Still, plain hierarchical clustering does not prov...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2017